Per-connection local rate limiting (#15843)
Conversation
Hi @gokulnair, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already.

@gokulnair mind addressing the format error? I'll review the rest in the meanwhile, thanks!
rgs1
left a comment
Did a first pass, let's also add docs and tests. Thanks!
Let's put a long description here explaining what this knob does, when it might be a good idea to use it, etc, etc.
Would it make sense to make this name a little more descriptive, such as rate_limiter_per_connection maybe?
Yes, I think that would help. I think rate_limiter_per_connection is good, given that we have connection_pool_per_downstream_connection in the cluster message definition. Alternatively, apply_per_connection is a suggestion if you're unconvinced.
Alternatively, per_connection_local_rate_limiter is also good. I think I like that one the most.
source/extensions/filters/http/local_ratelimit/local_ratelimit.h (outdated; comments resolved)
Force-pushed: 193f1d5 to 95054bc
Force-pushed: 5eab75d to e83052e
Force-pushed: 7d0fc2b to 23bf090
Thanks for the reviews! Let me know if there's anything else that's needed to get this going.
alyssawilk
left a comment
Oops, I'm sorry about review lag - Matt would usually take the lead here but his leave got unexpectedly extended. I'll pick it up instead!
api/envoy/extensions/filters/http/local_ratelimit/v3/local_rate_limit.proto (comments resolved)
Thanks @alyssawilk!
alyssawilk
left a comment
Looks totally solid! Just a few minor nits, and thanks for your patience both with review lag and me getting a handle on use case :-)
EXPECT_EQ(1U, findCounter("test.http_local_rate_limit.rate_limited"));
}

TEST_F(FilterTest, RequestRateLimitedPerConnection) {
WDYT of adding an integration test? I think you can crib off of the existing tests in ratelimit_integration_test.cc plus some of the h2 tests to make sure that if you set a limit of one stream, a second stream on that connection won't (immediately) go upstream, but a stream on a separate connection will pass through.
Sure, I can take a look at adding an integration test for the above scenario. The remaining comments have been addressed and should be good to go.
Ah! After taking a closer look, it seems like the tests in ratelimit_integration_test.cc all target the 'rate limiter as an external service' use case, and it makes complete sense to have integration tests for that.
For local rate limiting within Envoy, however, it doesn't look like we have any integration tests, and I'm wondering whether one buys us anything more than what the unit tests cover currently. Thoughts? cc @rgs1
I think integration tests generally have value - I wasn't convinced that setting connection data on the per-request StreamInfo would work as expected (there are currently no filters which do this, but I went ahead and tweaked some existing code to verify it worked, for my own satisfaction =P). Given the review delay this PR has already suffered and the lack of local rate limit integration tests to crib off of, I'm inclined to let it go this once, though if you want to do a follow-up you'd totally earn brownie points :-)
Brownie points it is then :-) I do have a draft up, but I'm fighting the integration test API a bit at the moment, specifically with trying to get it to send multiple requests over the same connection while still ensuring that the first one generates an upstream request over its fake_upstreams_ while the subsequent locally rate limited ones don't. It's holding things up more than necessary, so a follow-up task is making more sense at this point.
Thanks for taking a look!!
const Filters::Common::LocalRateLimit::LocalRateLimiterImpl& Filter::getRateLimiter() {
  if (!decoder_callbacks_->streamInfo().filterState()->hasData<PerConnectionRateLimiter>(
          PerConnectionRateLimiter::key())) {
}

if (config->requestAllowed(descriptors)) {
const bool is_request_allowed = config->rateLimitPerConnection()
I'd either put the config->requestAllowed vs. getRateLimiter().requestAllowed switch in Filter::requestAllowed, or rename requestAllowed to indicate it only applies to per-connection-rate-limited requests.
Ditto for getRateLimiter: getConnectionRateLimiter, maybe? With an assert that config->rateLimitPerConnection() holds?
antoniovicente
left a comment
Looks good, just some minor comments / nits.
// If set to true, a token bucket is allocated for each connection.
// Thus the rate limits are applied per connection, thereby allowing
// one to rate limit requests on a per connection basis.
bool local_rate_limit_per_downstream_connection = 11;
Is it possible for a proxy to have both per-process and per-downstream-connection rate limits, or just one or the other?
Another potential question is whether we should consider some additional rate limit criteria in the future, like downstream IP or HTTP cookie. If we expect additional criteria in the future, we may want to make this an enum field.
It's definitely one or the other as it stands currently, since I'm not sure that the added complexity of dealing with potentially conflicting token bucket quotas between the per-process and per-connection configurations, and the precedence rules we'd have to handle, buys us much in the way of functionality.
We did indeed consider an enum initially, but it didn't seem like there were very many realistic use cases, mainly because a lot of the toggles based on certain request characteristics, such as IP, cookie, etc., can be handled today by rate limiting on request descriptors ...
if (config->requestAllowed(descriptors)) {
const bool is_request_allowed = config->rateLimitPerConnection()
    ? requestAllowed(descriptors)
    : config->requestAllowed(descriptors);
Consider moving this branch into the requestAllowed method and calling the getRateLimiter() or config requestAllowed method based on the config.
It would be fine to pass config as an argument to Filter::requestAllowed.
return getRateLimiter().requestAllowed(request_descriptors);
}

const Filters::Common::LocalRateLimit::LocalRateLimiterImpl& Filter::getRateLimiter() {
naming nit: getPerConnectionRateLimiter
header:
  key: x-local-ratelimited
  value: 'true'
local_rate_limit_per_downstream_connection: {}
Seems like the {} is used in substitutions below. It would be good to add an end-of-line comment explaining the {}; I think the YAML comment character is #.
Added an explanation below.
filter_->decodeHeaders(request_headers, false));
EXPECT_EQ(Http::FilterHeadersStatus::Continue, filter_2_->decodeHeaders(request_headers, false));
EXPECT_EQ(Http::FilterHeadersStatus::StopIteration,
          filter_2_->decodeHeaders(request_headers, false));
A comment explaining the differences between this test and FilterTest.RequestRateLimited may be helpful for future readers. I think this shows that the limit is applied per connection by verifying that filter_2_ allows requests after filter_ hits the rate limit.
Added a brief explanation to the test.
Thanks!
Signed-off-by: Gokul Nair <gnair@twitter.com>
I've addressed all comments/nits. Will take a look at adding the integration test shortly.
Signed-off-by: Gokul Nair <gnair@twitter.com>
antoniovicente
left a comment
Looks good. I'm sure that the integration test you're looking into would make it even better.
Thanks!
/retest

Retrying Azure Pipelines:
alyssawilk
left a comment
LGTM
Could you please update the PR description with the appropriate fields (testing: unit tests; docs: inline; etc.), and then I'll let @antoniovicente merge when he's set.
I don't have remaining comments. The remaining question is whether or not the integration tests are in the works as part of this PR or a follow-up. A follow-up makes the most sense to me at this point; we should go ahead and merge.
I do agree that a follow-up task makes the most sense at this time (I do have a draft up, but it's still WIP).
@htuch Mind approving the API change once again, please? (We updated the doc in the proto file.)
Commit Message:
This is a PR for scoping token buckets in the local rate limiting flow on a per-connection basis, as opposed to scoping them to the entire Envoy instance. More details in #15637.

Additional Description:
Currently, the HTTP local rate limiter's token bucket is shared across all workers, thus causing the rate limits to be applied per Envoy instance/process. This could potentially result in bad actors quickly exhausting limits on a given Envoy instance before legitimate users have had a fair chance. We achieve per-connection scoping by adding an instance of the LocalRateLimit::LocalRateLimiterImpl to each connection object, if there isn't one already, via FilterState data.

Risk Level: Low

Testing:
Added unit tests to local_ratelimit.
Manually tested via curl'ing against a locally patched Envoy instance. One can send multiple requests on the same connection via curl using the following: curl -vI example.com example.com

Docs Changes:
Added a new toggle to the local rate limit configuration to enable per-connection local rate limiting:

// Specifies the scope of the rate limiter's token bucket.
// If set to false, the token bucket is shared across all worker threads,
// thus the rate limits are applied per Envoy process.
// If set to true, a token bucket is allocated for each connection.
// Thus the rate limits are applied per connection, thereby allowing
// one to rate limit requests on a per connection basis.
// If unspecified, the default value is false.
bool local_rate_limit_per_downstream_connection

Sample configuration:

typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  token_bucket:
    max_tokens: 10000
    tokens_per_fill: 1000
    fill_interval: 1s
  filter_enabled:
    runtime_key: local_rate_limit_enabled
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:
    runtime_key: local_rate_limit_enforced
    default_value:
      numerator: 100
      denominator: HUNDRED
  response_headers_to_add:
    - append: false
      header:
        key: x-local-rate-limit
        value: 'true'
  local_rate_limit_per_downstream_connection: true

Fixes #15637

Signed-off-by: Gokul Nair <gnair@twitter.com>